Indirect Reinforcement Learning with Adaptive State Space Partitions
Abstract
Model-based reinforcement learning can be applied to problems with continuous state spaces by discretizing the spaces with crisp or fuzzy partitions. Defining suitable partitions manually, however, is often not trivial: fine partitions lead to a large number of states and thus to complex discrete problems, whereas coarse partitions can be unsuitable for representing the optimal strategy. In this article, a novel model-based reinforcement learning approach, Adaptive Fuzzy Prioritized Sweeping (A-F-PS), is presented. The key idea of the approach is to represent internal models by clustered transitions. From clustered transitions, suitable partitions of the state space can be easily derived. Moreover, discretized models corresponding to these partitions can be easily calculated. Clustering is performed with an incremental variant of the fuzzy c-means algorithm, so that transitions need not be explicitly stored and the A-F-PS approach therefore has moderate storage requirements. The effectiveness of the method is shown by an example from traffic signal control.
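The abstract does not give the update rule of the incremental fuzzy c-means variant used in A-F-PS, so the following is only a minimal sketch of one generic online formulation, not the authors' method. It uses the standard fuzzy c-means membership formula and keeps each cluster center as a running weighted mean, so samples are discarded after each update. The class name `OnlineFCM`, the fuzzifier `m = 2`, the prior weight of 1 per initial center, and the one-dimensional sample data are all assumptions made for illustration.

```python
import numpy as np


def fcm_memberships(x, centers, m=2.0, eps=1e-12):
    """Standard fuzzy c-means memberships of point x to each center."""
    d2 = np.sum((centers - x) ** 2, axis=1) + eps  # squared distances
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum()  # memberships sum to 1


class OnlineFCM:
    """Hypothetical incremental fuzzy c-means (illustrative only).

    Each center is a running mean weighted by u**m, so no sample
    has to be stored after it has been processed.
    """

    def __init__(self, centers, m=2.0):
        self.centers = np.asarray(centers, dtype=float).copy()
        self.m = m
        # Assumption: each initial center counts as one prior sample.
        self.weights = np.ones(len(self.centers))

    def update(self, x):
        x = np.asarray(x, dtype=float)
        u = fcm_memberships(x, self.centers, self.m)
        w = u ** self.m
        self.weights += w
        # Move each center toward x by its share of the accumulated weight.
        self.centers += (w / self.weights)[:, None] * (x - self.centers)
        return u


# Usage: two clusters on a 1-D state variable (made-up data).
model = OnlineFCM(centers=[[0.0], [10.0]])
for _ in range(100):
    model.update([1.0])
    model.update([9.0])
print(model.centers.ravel())
```

In a setting like the one described above, `x` would be a vector describing a transition rather than a single scalar, and the resulting centers could then serve as the anchors of a fuzzy partition. How A-F-PS actually derives its partitions and discretized models is not specified in the abstract.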
Related resources
Reinforcement Learning in Complex Environments Through Multiple Adaptive Partitions
The application of Reinforcement Learning (RL) algorithms to learn tasks for robots is often limited by the large dimension of the state space, which may make the use of a tabular model prohibitive. In this paper, we introduce LEAP (Learning Entities Adaptive Partitioning), a model-free learning algorithm that uses overlapping partitions which are dynamically modified to learn near-opti...
Learning in Complex Environments through Multiple Adaptive Partitions
When using tabular value functions, the application of Reinforcement Learning (RL) algorithms to real-world problems may have prohibitive memory requirements and learning time. In this paper, we introduce LEAP (Learning Entities Adaptive Partitioning), a novel model-free learning algorithm in which the state space is decomposed into several overlapping partitions which are dynamically modified ...
FPGA Implementation of a Network of Neuronlike Adaptive Elements
A well-known model of reinforcement learning is called Adaptive Heuristic Critic learning. It is composed of two so-called "neuronlike adaptive elements" and has been used to solve difficult learning control problems. In this paper we present an FPGA design and implementation of this algorithm and, furthermore, describe a neurocontroller system composed of a network of neuronlike adaptive el...
Reinforcement Learning in Neural Networks: A Survey
In recent years, research on reinforcement learning (RL) has focused on bridging the gap between adaptive optimal control and bio-inspired learning techniques. Neural network reinforcement learning (NNRL) is among the most popular algorithms in the RL framework. Using neural networks enables RL to search for optimal policies more efficiently in several real-life applicat...
On the Computational Economics of Reinforcement Learning
Following terminology used in adaptive control, we distinguish between indirect learning methods, which learn explicit models of the dynamic structure of the system to be controlled, and direct learning methods, which do not. We compare an existing indirect method, which uses a conventional dynamic programming algorithm, with a closely related direct reinforcement learning method by applying ...
Publication date: 2000